Instantaneous Speaker Adaptation Through Selection and Combination of fMLLR Transformation Matrices

Authors

  • Diego Giuliani
  • Fabio Brugnara
Abstract

This paper addresses instantaneous speaker adaptation based on feature-space maximum likelihood linear regression (fMLLR) in the context of an automatic transcription task. We investigate fMLLR-based adaptation when the need for a preliminary decoding pass over a speech segment is removed, because the sufficient statistics for estimating the adaptation parameters are gathered with respect to a Gaussian mixture model. To cope with limited adaptation data, in addition to using feature-space maximum a posteriori linear regression (fMAPLR), we investigate estimating the transformation matrix applied to the speech segment through selection and combination of pre-computed fMLLR transformation matrices. For speaker-adaptively trained acoustic models, recognition experiments show that the proposed approach is moderately better than fMLLR but not as good as fMAPLR.
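
The selection-and-combination idea can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not say how candidates are scored or how they are weighted, so the choices here are assumptions. The sketch assumes a diagonal-covariance Gaussian mixture model, scores each pre-computed affine transform (A, b) by the likelihood of the transformed frames plus the Jacobian term T * log|det A|, keeps the `top_n` best candidates, and combines them with softmax weights over the per-frame scores. All function names (`gmm_loglik`, `score_transform`, `select_and_combine`) are hypothetical.

```python
import numpy as np

def gmm_loglik(X, weights, means, variances):
    """Total log-likelihood of frames X (T x D) under a diagonal-covariance GMM."""
    diff = X[:, None, :] - means[None, :, :]                      # T x M x D
    log_comp = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_comp = log_comp + np.log(weights)                         # T x M
    m = log_comp.max(axis=1, keepdims=True)                       # log-sum-exp trick
    return float(np.sum(m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))))

def score_transform(X, A, b, gmm):
    """Assumed score: likelihood of transformed frames plus T * log|det A|."""
    _, logdet = np.linalg.slogdet(A)
    return gmm_loglik(X @ A.T + b, *gmm) + X.shape[0] * logdet

def select_and_combine(X, transforms, gmm, top_n=2):
    """Pick the top_n pre-computed transforms and average them (assumed weighting)."""
    scores = np.array([score_transform(X, A, b, gmm) for A, b in transforms])
    best = np.argsort(scores)[::-1][:top_n]                       # indices, best first
    s = scores[best] / X.shape[0]                                 # per-frame scores
    w = np.exp(s - s.max())
    w = w / w.sum()                                               # softmax weights
    A = sum(wi * transforms[i][0] for wi, i in zip(w, best))
    b = sum(wi * transforms[i][1] for wi, i in zip(w, best))
    return A, b, best, w

# Toy usage: frames are offset by +3, so the candidate with bias -3 should be selected.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2)) + 3.0
gmm = (np.array([1.0]), np.zeros((1, 2)), np.ones((1, 2)))        # weights, means, variances
candidates = [(np.eye(2), np.zeros(2)),
              (np.eye(2), -3.0 * np.ones(2)),
              (np.eye(2), 3.0 * np.ones(2))]
A, b, best, w = select_and_combine(X, candidates, gmm, top_n=2)
```

Because the score only needs the Gaussian mixture model and the segment's frames, no first-pass decoding is involved in this sketch, which mirrors the setting described in the abstract. A linear average of matrices is only one possible combination rule.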

Similar resources

Online Speaker Adaptation with Pre-Computed FMLLR Transformations

This paper presents a memory-efficient single-pass speech recognizer that makes use of pre-computed FMLLR transformations for online speaker adaptation. For that purpose we apply unsupervised segment clustering to the training corpus, create a transformation matrix for each cluster, and train a text-independent Gaussian mixture classifier for cluster selection during runtime. We use the RWTH Aac...

Feature and model space speaker adaptation with full covariance Gaussians

Full covariance models can give better results for speech recognition than diagonal models, yet they introduce complications for standard speaker adaptation techniques such as MLLR and fMLLR. Here we introduce efficient update methods to train adaptation matrices for the full covariance case. We also experiment with a simplified technique in which we pretend that the full covariance Gaussians a...

Preliminary Work on Speaker Adaptation for DNN-based Speech Synthesis

We investigate speaker adaptation in the context of deep neural network (DNN) based speech synthesis. More specifically, our current work focuses on the exploitation of auxiliary information such as gender, speaker identity or age during the DNN training process. The proposed technique is compared to standard acoustic feature transformations such as the feature based maximum likelihood linear r...

A fast maximum likelihood nonlinear feature transformation method for GMM-HMM speaker adaptation

We describe a novel maximum likelihood nonlinear feature bias compensation method for Gaussian mixture model–hidden Markov model (GMM–HMM) adaptation. Our approach exploits a single-hidden-layer neural network (SHLNN) that, similar to the extreme learning machine (ELM), uses randomly generated lower-layer weights and linear output units. Different from the conventional ELM, however, our approach...

Publication date: 2011